Abstract

Let $H_{n,(p_m)_{m=2,\ldots,M}}$ be a random non-uniform hypergraph of dimension
$M$ on $2n$ vertices, where the vertices are split into two disjoint sets
of size $n$, and colored by two distinct colors. Each non-monochromatic
edge of size $m = 2,\ldots,M$ is independently added with probability
$p_m$. We show that if $p_2,\ldots,p_M$ are such that the expected number of
edges in the hypergraph is at least $dn\ln n$, for some $d > 0$ sufficiently
large, then with probability $(1 - o(1))$, one can find a proper 2-coloring
of $H_{n,(p_m)_{m=2,\ldots,M}}$ in polynomial time. We present a polynomial time
algorithm for hypergraph 2-coloring, and provide discussions on extension
of the approach for $k$-coloring of non-uniform hypergraphs.


1 Introduction 


A hypergraph $H = (V, E)$ is said to be bipartite or 2-colorable if the vertex
set $V$ can be partitioned into two disjoint sets $V_1$ and $V_2$ such that every edge
$e \in E$ has non-empty intersections with both the partitions. In the case of
graphs, one can easily find the two partitions from any given instance of $H$
by breadth first search. However, the problem turns out to be notoriously
hard if edges of size more than 2 are present. In fact, in the case of bipartite
3-uniform and 4-uniform hypergraphs, it is well known that the problem is
NP-hard.
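The graph case mentioned above can be illustrated with a short sketch. The function name `bfs_two_coloring` and the edge-list input format are assumptions made for this illustration; the idea is only the standard breadth first search that alternates colors level by level.

```python
from collections import deque


def bfs_two_coloring(num_vertices, edges):
    """Return a list of colors (0 or 1), or None if the graph is not bipartite.

    `edges` is a list of (u, v) pairs with vertices numbered 0..num_vertices-1.
    """
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    color = [None] * num_vertices
    for start in range(num_vertices):
        if color[start] is not None:
            continue  # already reached from an earlier component
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]  # neighbors get the opposite color
                    queue.append(v)
                elif color[v] == color[u]:
                    return None  # odd cycle found: no proper 2-coloring exists
    return color
```

For a 4-cycle this returns an alternating coloring, while a triangle yields `None`. No comparably simple procedure is available once edges of size larger than 2 are present.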

In general, finding a proper 2-coloring is relatively easy if the hypergraph
is sparse. In an answer to a question asked by Erdős [12] on 2-colorability
of uniform hypergraphs, it is now known that for large $m$, any $m$-uniform
hypergraph on $n$ vertices with at most $0.7 \cdot 2^{m} \sqrt{m/\ln m}$ edges is 2-colorable.
As pointed out in [25], the result can also be extended to non-uniform
hypergraphs with minimum edge size $m$. However, the situation is much worse if the
restriction on the minimum edge size and the number of hyperedges is not imposed.
Even when a hypergraph is 2-colorable, the best known algorithms
require $O(\cdot)$ colors to properly color the hypergraph in polynomial
time, where $M$ is the maximum edge size, also called dimension, of
the hypergraph. In recent years, 2-colorability of random hypergraphs has
also received considerable attention. Through a series of works,
it is now established that random uniform hypergraphs are 2-colorable only
when the number of edges is at most $Cn$, for some constant $C > 0$. Thus,
it is evident that coloring relatively dense hypergraphs is difficult unless the
hypergraph admits a "nice" structure.

In spite of the hardness of the problem, there are a number of applications
that require hypergraph coloring algorithms. For instance, such algorithms
have been used for approximate DNF counting [18], as well as in various
resource allocation and scheduling problems. The connection between
"Not-All-Equal" (NAE) SAT and hypergraph 2-coloring also demonstrates
its significance in the context of satisfiability problems. Among the various
approaches studied in the literature, perhaps the only known non-probabilistic
instances of efficient 2-coloring are in the cases where the hypergraph is
$\alpha$-dense, 3-uniform and bipartite [8], or where the hypergraph is $m$-uniform
and its every edge has an equal number of vertices of either color [19].

In this paper, we consider the problem of coloring random non-uniform
hypergraphs of dimension $M$ that have an underlying planted bipartite
structure. We present a polynomial time algorithm that can properly 2-color
instances of the random hypergraph with high probability whenever the
expected number of edges is at least $dn\ln n$ for some constant $d > 0$. To the
best of our knowledge, such a model has only been considered by Chen and
Frieze [8], who extended a graph coloring approach of Alon and Kahale [3]
to present an algorithm for 2-coloring of 3-uniform bipartite hypergraphs
with $dn$ edges. To this end, our work generalizes the results of
[8] to non-uniform hypergraphs, and it is the first algorithm that is
guaranteed to properly color non-uniform bipartite hypergraphs using only two
colors. We also discuss the possible extension of our approach to the case of
non-uniform $k$-colorable hypergraphs.

The Main Result 

Before stating the main result of this paper, we present the planted model
under consideration, which is based on the model that is studied in [8].
The random hypergraph $H_{n,(p_m)_{m=2,\ldots,M}}$ is generated on the set of vertices
$V = \{1,2,\ldots,2n\}$, which is arbitrarily split into two sets, each of size $n$,
and the sets are colored with two different colors. Given an integer $M$, and
$p_2,\ldots,p_M \in [0,1]$, the edges of the hypergraph are randomly added in the
following way. All the edges of size at most $M$ are added independently,
and for any $e \subset V$,

$$\mathsf{P}(e \in E) = \begin{cases} p_m & \text{if } e \text{ is not monochromatic and } |e| = m, \\ 0 & \text{otherwise.} \end{cases}$$

We prove the following result.

Theorem 1. Assume $M = O(1)$. There is a constant $d > 0$ such that if

$$\sum_{m=2}^{M} p_m \binom{2n}{m} \ge dn\ln n, \qquad (1)$$

then with probability $(1 - o(1))$, Algorithm COLOR (presented in next section)
finds a proper 2-coloring of the random non-uniform bipartite hypergraph
$H_{n,(p_m)_{m=2,\ldots,M}}$.

It is easy to see that the expected number of edges in the hypergraph is
$\Theta\big(\sum_{m=2}^{M} p_m \binom{2n}{m}\big)$, and so, condition (1) may be stated in terms of expected
number of edges.
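The planted model can be made concrete with a small sampler. This is only a sketch for tiny instances (it enumerates all vertex subsets, so it is exponential in the edge size); the function names, the 0-based vertex numbering with color classes `0..n-1` and `n..2n-1`, and the dictionary `probs` mapping each edge size $m$ to $p_m$ are assumptions made for illustration.

```python
import itertools
import math
import random


def expected_num_edges(n, probs):
    """Expected edge count: p_m times the number of non-monochromatic m-subsets."""
    return sum(
        p * (math.comb(2 * n, m) - 2 * math.comb(n, m))
        for m, p in probs.items()
    )


def sample_planted_hypergraph(n, probs, seed=0):
    """Sample edges on vertices 0..2n-1 with planted classes {0..n-1}, {n..2n-1}."""
    rng = random.Random(seed)
    edges = []
    for m, p in probs.items():
        for e in itertools.combinations(range(2 * n), m):
            monochromatic = all(v < n for v in e) or all(v >= n for v in e)
            # Monochromatic subsets are never added; others appear with prob. p_m.
            if not monochromatic and rng.random() < p:
                edges.append(e)
    return edges
```

The exact count used here, $\binom{2n}{m} - 2\binom{n}{m}$ non-monochromatic subsets of size $m$, is of the same order as $\binom{2n}{m}$, which is why the condition of Theorem 1 can be phrased through the expected number of edges.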


Organization of this paper 

The rest of the paper is organized in the following manner. In Section 2, we
present our coloring algorithm, followed by a proof of Theorem 1 in Section 3.
In the concluding remarks in Section 4, we provide discussions about the
key assumptions made in this work, and also the possible extensions of our
results to $k$-coloring and strong coloring of non-uniform hypergraphs. The
appendix contains proofs of the lemmas mentioned in Section 3.


2 Spectral algorithm for hypergraph coloring 


The coloring algorithm, presented below, is similar in spirit to earlier
spectral methods, but certain key differences exist, which are essential to
deal with non-uniform hypergraphs.

Given a hypergraph $H = (V,E)$, an initial guess of the color classes is
formed by exploiting the spectral properties of a certain matrix
$A \in \mathbb{R}^{2n\times 2n}$ defined as

$$A_{ij} = \begin{cases} \sum\limits_{e \in E : e \ni i,j} \dfrac{1}{|e|} & \text{if } i \ne j, \text{ and} \\ \sum\limits_{e \in E : e \ni i} \dfrac{1}{|e|} & \text{if } i = j. \end{cases} \qquad (2)$$

The above matrix has been used in the literature to construct the Laplacian
of a hypergraph, and is also known to be related to the affinity matrix
of the star expansion of a hypergraph [2]. The use of matrix $A$ is in contrast
to the adjacency based graph construction of [8] that is likely to result in a
complete graph if the hypergraph is dense.

The later stage of the algorithm considers an iterative procedure that is
similar to those of earlier works, but uses a weighted summation of neighbors.
Such weighting is crucial while dealing with the edges of different sizes.
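As a minimal sketch of how the matrix $A$ can be assembled, each edge $e$ contributes $1/|e|$ to every entry $A_{ij}$ with $i, j \in e$, including the diagonal. The function name `hypergraph_matrix` and the 0-based edge-list format are assumptions made for this illustration.

```python
import numpy as np


def hypergraph_matrix(num_vertices, edges):
    """Build A with A[i, j] = sum over edges containing both i and j of 1/|e|.

    With i == j this gives the diagonal: the sum of 1/|e| over edges containing i.
    """
    A = np.zeros((num_vertices, num_vertices))
    for e in edges:
        idx = np.array(sorted(set(e)))
        # Add 1/|e| to the whole |e| x |e| block indexed by the edge's vertices.
        A[np.ix_(idx, idx)] += 1.0 / len(idx)
    return A
```

Larger edges contribute less per vertex pair, which is the weighting that distinguishes this matrix from a plain adjacency count.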


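Algorithm COLOR: Colors a non-uniform hypergraph $H$

1: Define the matrix $A$ as in (2).
2: Compute $x^{A} = \arg\min_{\|x\|_2 = 1} x^{T} A x$.
3: Let $T = \lceil \log_2 n \rceil$, $V_1^{(0)} = \{i \in V : x^{A}_i \ge 0\}$ and $V_2^{(0)} = \{i \in V : x^{A}_i < 0\}$.
4: for $t = 1,2,\ldots,T$ do
5:    Let $V_1^{(t)} = \Big\{ i \in V : \sum_{j \in V_1^{(t-1)}\setminus\{i\}} A_{ij} \le \sum_{j \in V_2^{(t-1)}\setminus\{i\}} A_{ij} \Big\}$
      and $V_2^{(t)} = V \setminus V_1^{(t)}$.
6: end for
7: if $\exists\, e \in E$ such that $e \subset V_1^{(T)}$ or $e \subset V_2^{(T)}$ then
8:    Algorithm FAILS.
9: else
10:   2-Color $V$ according to the partitions $V_1^{(T)}, V_2^{(T)}$.
11: end if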
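The steps of Algorithm COLOR can be sketched in NumPy as follows. This is an illustrative reading of the pseudocode, not a reference implementation: the function name `color_hypergraph`, the convention that ties go to the first class, and the choice to take the matrix $A$ of (2) as a precomputed array are all assumptions.

```python
import math
import numpy as np


def color_hypergraph(A, edges):
    """Return a 0/1 class indicator per vertex, or None if the algorithm fails.

    A is the symmetric (2n x 2n) matrix of (2); edges is a list of vertex tuples.
    """
    num_vertices = A.shape[0]
    n = num_vertices // 2

    # Step 2: unit eigenvector for the smallest eigenvalue (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(A)
    x = eigvecs[:, 0]

    # Step 3: initial guess from the signs of the eigenvector.
    T = math.ceil(math.log2(n))
    in_v1 = x >= 0

    # Steps 4-6: move each vertex to the class it is more weakly connected to.
    off_diag = A - np.diag(np.diag(A))  # the sums exclude j == i
    for _ in range(T):
        weight_to_v1 = off_diag @ in_v1.astype(float)
        weight_to_v2 = off_diag @ (~in_v1).astype(float)
        in_v1 = weight_to_v1 <= weight_to_v2

    # Steps 7-11: fail if some edge is monochromatic.
    for e in edges:
        side = in_v1[list(e)]
        if side.all() or not side.any():
            return None
    return in_v1.astype(int)
```

The refinement step is vectorized: a single matrix-vector product gives, for every vertex, its total weight toward the current first class. Note that a vertex moves toward the class with the smaller weight, since in the planted model vertices of the same color share fewer edges.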


3 Proof of Main Result 

We now prove Theorem 1. Without loss of generality, assume that the
true color classes in $V$ are $\{1,2,\ldots,n\}$ and $\{n+1,\ldots,2n\}$. Also, let $W^{(t)}$,
$t = 0,1,\ldots,T$, denote the set of incorrectly colored vertices after iteration $t$, with
$W^{(0)}$ being the incorrectly colored nodes after the initial spectral step. We prove
Theorem 1 by showing that with probability $(1 - o(1))$, the size $|W^{(T)}| < 1$,
which implies that all nodes are correctly colored, and hence, the hypergraph
must be properly colored.

The first lemma bounds the size of $W^{(0)}$, i.e., the error incurred at the
initial spectral step.

Lemma 1. With probability $(1 - o(1))$, $|W^{(0)}| \le \dfrac{n}{M^2 2^{2M+4}}$.

Next, we analyze the iterative stage of the algorithm to make the following
claim, which characterizes the vertices that are correctly colored after
iteration $t$.

Lemma 2. Let

$$\eta = \frac{1}{2^{M+2}} \sum_{m=2}^{M} \frac{p_m (n-1)}{m} \binom{n-2}{m-2}.$$

For any $t \in \{1,\ldots,T\}$, if $\sum_{j \in W^{(t-1)}\setminus\{i\}} A_{ij} < \eta$ for any $i \in V$, then
$\mathsf{P}(i \in W^{(t)}) \le$ [...].

Note that there are only $T = \lceil \log_2 n \rceil$ iterations, and $|V| = 2n$.
Combining the result of Lemma 2 with union bound, we can conclude that with
probability $(1 - o(1))$, for all iterations $t = 1,2,\ldots,T$, there does not
exist any $i \in W^{(t)}$ such that $\sum_{j \in W^{(t-1)}\setminus\{i\}} A_{ij} < \eta$. We also make the following
observation, where $\eta$ is defined in Lemma 2.

Lemma 3. With probability $(1 - o(1))$, there does not exist $C_1, C_2 \subset V$ such
that $|C_1| \le \dfrac{n}{M^2 2^{2M+4}}$, $|C_2| = \frac{1}{2}|C_1|$ and for all $i \in C_2$, $\sum_{j \in C_1\setminus\{i\}} A_{ij} \ge \eta$.

We now use the above lemmas to proceed with the proof of Theorem 1.
Lemma 1 shows that $|W^{(0)}| \le \frac{n}{M^2 2^{2M+4}}$ with probability $(1 - o(1))$.
Conditioned on this event, and due to the conclusion of Lemma 2, one can argue
that Lemma 3 is violated unless $|W^{(t)}| \le \frac{1}{2}|W^{(t-1)}|$ for all iterations $t$ with
probability $(1 - o(1))$. Thus, in each iteration, the number of incorrectly
colored vertices is reduced by at least half. Hence, after $T = \lceil \log_2 n \rceil$
iterations, $|W^{(T)}| < 1$, which implies that all vertices are correctly colored.
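The halving argument can be written out as a short calculation. This is a sketch under two assumptions: that the Lemma 1 bound reads $|W^{(0)}| \le n/(M^2 2^{2M+4})$, and that each iteration at least halves the set $W^{(t)}$ of incorrectly colored vertices.

```latex
\begin{align*}
|W^{(T)}| \;\le\; 2^{-T}\,|W^{(0)}|
  \;\le\; \frac{1}{n}\cdot\frac{n}{M^{2}\,2^{2M+4}}
  \;=\; \frac{1}{M^{2}\,2^{2M+4}} \;<\; 1 .
\end{align*}
```

The second step uses $T = \lceil \log_2 n \rceil$, so that $2^{-T} \le 1/n$. Since $|W^{(T)}|$ is a non-negative integer, it must equal zero.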

4 Discussions and Concluding remarks 

In this paper, we showed that a random non-uniform bipartite hypergraph
of dimension $M$ with balanced partitions can be properly 2-colored with
probability $(1 - o(1))$ by a polynomial time algorithm. The proposed method
uses a spectral approach to form an initial guess of the color classes, which is
further refined iteratively. To the best of our knowledge, this is the first work
on 2-coloring bipartite non-uniform hypergraphs. Previous works
have been restricted to the case of uniform hypergraphs.

A note on the assumptions in Theorem 1

The key assumptions made in this paper are the following:

1. $M = O(1)$, and

2. $p_2,\ldots,p_M$ are such that the expected number of edges is larger than
$dn\ln n$, where $d > 0$ is a large constant.

The assumption $M = O(1)$ is crucial, particularly in Lemma 1, and helps
to ensure that $d$ can be chosen to be a constant. This can be avoided if $d$ is
allowed to increase with $n$ appropriately. We note that a previous work on
spectral hypergraph partitioning allows $M$ to grow with $n$, but imposes
an additional restriction so that the number of edges of larger size decays
rapidly.

The second assumption is stronger than the one in [8], where it was
shown that a random bipartite 3-uniform hypergraph can be properly
2-colored with high probability if the expected number of edges is $dn$. This is
due to the use of matrix Bernstein inequality [23] in Lemma 1, which does not
provide useful bounds in the most sparse case. On the other hand, Chen and
Frieze [8] use the techniques of Kahn and Szemerédi, which allow them
to work in the most sparse regime. However, it is not clear how the same
techniques can be extended even to uniform hypergraphs of higher order.
Thus, it remains an open problem whether a similar result can be proved
when the number of edges in the hypergraph grows linearly with $n$.


$k$-coloring of hypergraphs

Though Algorithm COLOR has been presented only for the hypergraph
2-coloring problem, one may easily extend the approach to achieve a
$k$-coloring, where the objective is to color the vertices of the hypergraph with
$k$ colors such that no edge is monochromatic. A possible extension of
Algorithm COLOR is as follows:

1. In Step 2, compute the eigenvectors corresponding to the $(k-1)$
smallest eigenvalues of $A$.

2. Use the $k$-means algorithm [20] to cluster rows of the eigenvector
matrix into $k$ groups, and define the initial guess for the color classes
$V_1^{(0)},\ldots,V_k^{(0)}$ in Step 3 according to the above clustering.

3. The iterative computation in Step 5 is modified by defining

$$V_l^{(t)} = \Big\{ i \in V : \sum_{j \in V_l^{(t-1)}\setminus\{i\}} A_{ij} \le \sum_{j \in V_{l'}^{(t-1)}\setminus\{i\}} A_{ij} \text{ for all } l' \ne l \Big\}$$

for $l = 1,2,\ldots,(k-1)$, and $V_k^{(t)} = V \setminus \big( \bigcup_{l<k} V_l^{(t)} \big)$.

In the above modification, we borrow the popular idea of using $k$-means on
the rows of eigenvector matrix to find $k$ planted partitions in a graph or
hypergraph.
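The modified iterative step for $k$ classes can be sketched as a single reassignment pass. The function name `refine_k_classes` and the label-array representation are assumptions made for illustration; ties are broken toward the lowest class index via `argmin`, which is a simplification of the set definition above.

```python
import numpy as np


def refine_k_classes(A, labels, k):
    """One refinement pass: send each vertex to the class of least total weight.

    A is the symmetric matrix of (2); labels[i] in {0, ..., k-1} is the current
    class of vertex i. Returns the new label array.
    """
    off_diag = A - np.diag(np.diag(A))      # sums exclude j == i
    indicator = np.eye(k)[np.asarray(labels)]  # one-hot rows, shape (vertices, k)
    # weights[i, l] = sum of A[i, j] over j != i currently in class l
    weights = off_diag @ indicator
    return np.argmin(weights, axis=1)
```

Repeating this pass $T$ times from the $k$-means initialization mirrors the loop of the two-class algorithm.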

We believe that the result in Theorem 1 can be extended to this setting,
where the random model allows for $k$ planted color classes in the hypergraph
with non-monochromatic edges generated in the aforementioned manner.
Assuming $k = O(1)$ and that the $k$-means algorithm always provides a near optimal
solution, one can follow the arguments of [11] to prove a result similar to
Lemma 1. On the other hand, Lemmas 2 and 3 should hold for an
appropriate choice of $\eta$. Hence, one can expect that the algorithm achieves a
proper $k$-coloring with probability $(1 - o(1))$.

We also note that Algorithm COLOR can be used for finding solutions of
NAE-SAT problems. The extension of COLOR is also applicable for strong
coloring of hypergraphs, which finds applications in design of communication
networks [24].


Proofs of technical lemmas 


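Proof of Lemma 1

We view the random matrix $A \in \mathbb{R}^{2n\times 2n}$ as a perturbation of its expected
value $\bar{A} = \mathsf{E}[A]$. Let $\mathcal{E}$ denote the collection of all the non-monochromatic
subsets of $V$ of size at most $M$. One can verify that for any $i,j \in V$, $i \ne j$,

$$\bar{A}_{ij} = \sum_{e \in \mathcal{E} : e \ni i,j} \frac{p_{|e|}}{|e|} \qquad \text{and} \qquad \bar{A}_{ii} = \sum_{e \in \mathcal{E} : e \ni i} \frac{p_{|e|}}{|e|}.$$

Counting the number of possible edges of each size, one can see that

$$\bar{A}_{ij} = \begin{cases} \alpha_1 - \alpha_2 & \text{if } i \ne j, \text{ and } i,j \text{ belong to same color class,} \\ \alpha_1 & \text{if } i \ne j, \text{ and } i,j \text{ belong to different color class,} \\ \alpha_1 - \alpha_2 + \alpha_3 & \text{if } i = j, \end{cases} \qquad (3)$$

where

$$\alpha_1 = \sum_{m=2}^{M} \frac{p_m}{m}\binom{2n-2}{m-2}, \qquad \alpha_2 = \sum_{m=2}^{M} \frac{p_m}{m}\binom{n-2}{m-2}, \qquad \text{and} \qquad \alpha_3 = \sum_{m=2}^{M} \frac{p_m}{m}\left[\binom{2n-2}{m-1} - \binom{n-2}{m-1}\right].$$

Hence, we can write $\bar{A}$ as

$$\bar{A} = \alpha_1 \mathbf{1}_{2n\times 2n} - \alpha_2 \begin{bmatrix} \mathbf{1}_{n\times n} & 0 \\ 0 & \mathbf{1}_{n\times n} \end{bmatrix} + \alpha_3 I_{2n}, \qquad (4)$$

where $I_{2n}$ is the $2n$-dimensional identity matrix, and $\mathbf{1}_{n\times n}$ is an $n \times n$ matrix
of all 1's. One can verify that the smallest eigenvalue of $\bar{A}$ is $(\alpha_3 - n\alpha_2)$,
which has multiplicity 1, and is separated from the other eigenvalues by an
eigen-gap of $n\alpha_2$. Moreover, the corresponding unit norm eigenvector $x^{\bar{A}}$ is
such that $x^{\bar{A}}_i = \frac{1}{\sqrt{2n}}$ for all $i \le n$, and $x^{\bar{A}}_i = -\frac{1}{\sqrt{2n}}$ for all $i > n$, up to a
possible change of sign.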
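The spectral claims about the expected matrix can be checked numerically on a small instance. This is a sketch under the assumption that $\alpha_1, \alpha_2, \alpha_3$ are the binomial sums that appear in the proof; the function name `expected_matrix` and the dictionary `probs` mapping each edge size $m$ to $p_m$ are made up for the illustration.

```python
import math
import numpy as np


def expected_matrix(n, probs):
    """Return the expected matrix and (alpha_1, alpha_2, alpha_3) for class size n."""
    a1 = sum(p / m * math.comb(2 * n - 2, m - 2) for m, p in probs.items())
    a2 = sum(p / m * math.comb(n - 2, m - 2) for m, p in probs.items())
    a3 = sum(
        p / m * (math.comb(2 * n - 2, m - 1) - math.comb(n - 2, m - 1))
        for m, p in probs.items()
    )
    ones = np.ones((n, n))
    zeros = np.zeros((n, n))
    same_class = np.block([[ones, zeros], [zeros, ones]])
    A_bar = a1 * np.ones((2 * n, 2 * n)) - a2 * same_class + a3 * np.eye(2 * n)
    return A_bar, (a1, a2, a3)


n = 4
A_bar, (a1, a2, a3) = expected_matrix(n, {2: 0.5, 3: 0.2})
eigvals, eigvecs = np.linalg.eigh(A_bar)  # eigenvalues in ascending order
smallest, gap = eigvals[0], eigvals[1] - eigvals[0]
bottom_vector = eigvecs[:, 0]
```

On this instance `smallest` agrees with $\alpha_3 - n\alpha_2$, `gap` with $n\alpha_2$, and `bottom_vector` has entries of magnitude $1/\sqrt{2n}$ with opposite signs on the two color classes.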
At this stage, we refer to a well-known result from matrix perturbation
theory. We state the result in a particular form that is appropriate in
our setting. The result, as stated in Theorem 2, has been previously used
in [13, Lemma 4.4] and [17].

Theorem 2 (Davis-Kahan $\sin\Theta$ theorem). Let $\bar{A} \in \mathbb{R}^{N\times N}$ be a symmetric
matrix, and $A$ be an additive perturbation of $\bar{A}$. Let $S \subset \mathbb{R}$ be any interval
that contains exactly $k$ eigenvalues of $\bar{A}$. Define

$$\delta = \min\{|\lambda - \lambda'| : \lambda \in S,\ \lambda' \notin S, \text{ and } \lambda, \lambda' \text{ are eigenvalues of } \bar{A}\}.$$

If $\delta > 2\|A - \bar{A}\|_2$, then $S$ also contains exactly $k$ eigenvalues of $A$.
Let $X, \bar{X} \in \mathbb{R}^{N\times k}$ be orthonormal eigenvector matrices for the eigenvalues
in $S$ of $A$, $\bar{A}$ respectively. Then there is an orthonormal (rotation) matrix
$Q \in \mathbb{R}^{k\times k}$ such that

$$\|X - \bar{X}Q\|_F \le \frac{2\sqrt{2k}\,\|A - \bar{A}\|_2}{\delta}.$$

By viewing $A$ as a perturbation of $\bar{A}$ and noting that the eigen-gap
$\delta = n\alpha_2$, one can use Theorem 2 with $k = 1$ to conclude that if $\alpha_2 > \frac{2}{n}\|A - \bar{A}\|_2$, then

$$\|x^{A} - x^{\bar{A}}\|_2 \le \frac{2\sqrt{2}\,\|A - \bar{A}\|_2}{n\alpha_2}. \qquad (5)$$

One can write $A$ as $A = \sum_{e \in \mathcal{E}} \frac{h_e}{|e|} a_e a_e^{T}$, where, for each set $e \in \mathcal{E}$, $h_e$ is a
Bernoulli($p_{|e|}$) random variable, and $a_e \in \{0,1\}^{2n}$ is such that $(a_e)_i = 1$
only when $i \in e$. Hence, one may view $A$ as a sum of independent random
matrices. To this end, the following concentration inequality is quite useful
to derive a bound on the perturbation $\|A - \bar{A}\|_2$.

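Theorem 3 (Matrix Bernstein inequality [23]). Consider a finite sequence
$X_1, X_2, \ldots, X_L$ of independent, random, self-adjoint matrices with
dimension $d$ (here $d$ denotes the matrix dimension, not the constant of Theorem 1).
Assume that each random matrix satisfies $\|X_l - \mathsf{E}[X_l]\|_2 \le R$ almost
surely. Define $X = \sum_{l=1}^{L} X_l$ and $\mathrm{Var}(X) = \mathsf{E}[(X - \mathsf{E}[X])^2]$, where we
assume all the above expectations exist. Then for all $t > 0$,

$$\mathsf{P}\big(\|X - \mathsf{E}[X]\|_2 \ge t\big) \le 2d \exp\left( \frac{-t^2}{2\|\mathrm{Var}(X)\|_2 + \frac{2}{3}Rt} \right).$$

Each summand $\frac{h_e}{|e|} a_e a_e^{T}$ deviates from its mean by at most 1 in spectral norm, so the
above result with $R = 1$ directly implies

$$\mathsf{P}\big(\|A - \bar{A}\|_2 \ge 4\sqrt{n\alpha_1 \ln n}\big) \le 4n \exp\left( \frac{-16 n\alpha_1 \ln n}{2\|\mathrm{Var}(A)\|_2 + \frac{8}{3}\sqrt{n\alpha_1 \ln n}} \right). \qquad (6)$$

We note that choosing $d$ large enough, one can satisfy $n\alpha_1 \ge \ln n$. Also,
observe that

$$\|\mathrm{Var}(A)\|_2 \le \max_{1 \le i \le 2n} \sum_{j=1}^{2n} (\mathrm{Var}(A))_{ij} \le \max_{1 \le i \le 2n} \sum_{j=1}^{2n} \bar{A}_{ij} \le 4n\alpha_1.$$

Substituting these in (6), we have

$$\mathsf{P}\big(\|A - \bar{A}\|_2 \ge 4\sqrt{n\alpha_1 \ln n}\big) \le 4n \exp\left( \frac{-16 n\alpha_1 \ln n}{8n\alpha_1 + \frac{8}{3} n\alpha_1} \right) = \frac{4}{\sqrt{n}} = o(1). \qquad (7)$$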
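The arithmetic behind the $o(1)$ rate can be spelled out as follows. This is a worked sketch, assuming $R = 1$ for the centered summands (the extracted text does not state the value of $R$ explicitly), $t = 4\sqrt{n\alpha_1 \ln n}$, $\|\mathrm{Var}(A)\|_2 \le 4n\alpha_1$ and $n\alpha_1 \ge \ln n$.

```latex
\begin{align*}
\tfrac{2}{3} R\, t &= \tfrac{8}{3}\sqrt{n\alpha_1 \ln n} \;\le\; \tfrac{8}{3}\, n\alpha_1
  && \text{(since } \ln n \le n\alpha_1 \text{)}, \\
\frac{t^{2}}{2\|\mathrm{Var}(A)\|_2 + \tfrac{2}{3} R\, t}
  &\ge \frac{16\, n\alpha_1 \ln n}{8 n\alpha_1 + \tfrac{8}{3} n\alpha_1}
  = \frac{3}{2}\ln n, \\
4n \exp\!\left(-\tfrac{3}{2}\ln n\right) &= \frac{4}{\sqrt{n}} = o(1).
\end{align*}
```

The prefactor $4n$ is $2d$ with matrix dimension $d = 2n$.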


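Thus, with probability $(1 - o(1))$ we have $\|A - \bar{A}\|_2 \le 4\sqrt{n\alpha_1 \ln n}$. Due
to this bound, one can argue that if $n\alpha_2 > 8\sqrt{\alpha_1 n \ln n}$, i.e., $\frac{\alpha_1}{\alpha_2^2} < \frac{n}{64 \ln n}$,
then the condition in Theorem 2 is satisfied, and the perturbation bound (5)
holds. We can compute that

$$\frac{\alpha_1}{\alpha_2^2} = \frac{\sum_{m=2}^{M} \frac{p_m}{m}\binom{2n-2}{m-2}}{\left( \sum_{m=2}^{M} \frac{p_m}{m}\binom{n-2}{m-2} \right)^{2}} \le \frac{n\, 2^{2M+2}}{d \ln n}.$$

Hence, choosing $d$ sufficiently large, the above mentioned condition holds,
and one can claim from (5) that

$$\|x^{A} - x^{\bar{A}}\|_2 \le \frac{8\sqrt{2 n\alpha_1 \ln n}}{n\alpha_2} \le \frac{2^{M+4.5}}{\sqrt{d}}.$$

Now, we define the set $W \subset V$ as $W = \big\{ i \in V : |x^{A}_i - x^{\bar{A}}_i| \ge \frac{1}{\sqrt{2n}} \big\}$.
From the definition of the color classes $V_1^{(0)}, V_2^{(0)}$, it directly follows that
any vertex not in $W$ must be correctly colored. Hence,

$$|W^{(0)}| \le |W| \le \sum_{i \in W} 2n\,|x^{A}_i - x^{\bar{A}}_i|^2 \le 2n\,\|x^{A} - x^{\bar{A}}\|_2^2 \le \frac{n\, 2^{2M+10}}{d},$$

where the bound holds with probability $(1 - o(1))$. Thus, choosing $d$
sufficiently large, one obtains that $|W^{(0)}| \le \frac{n}{M^2 2^{2M+4}}$.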
Proof of Lemma 2

Consider any $i \le n$. Note that $i$ is correctly colored in iteration $t$ if

$$\sum_{j \in V_1^{(t-1)}\setminus\{i\}} A_{ij} \le \sum_{j \in V_2^{(t-1)}\setminus\{i\}} A_{ij},$$

or equivalently,

$$\sum_{j \in V_1^{(t-1)}\setminus\{i\}} A_{ij} \le \frac{1}{2} \sum_{j \ne i} A_{ij}. \qquad (8)$$

Hence, it suffices to show that (8) holds under the condition stated in the
lemma. A similar condition can be stated for $i > n$.

We note that E ^ and so, from Bernstein in- 

j^i e£S:e3i 

equality, we have 


P 


— 

j¥=i 




< exp 


Vl+4 I Xy 


22M+4 


V 


2 E n^T^Var(/ie) + 1+2 E Aj 

eee-.e3i j^i 




< exp —H I Aj 


< n 




The second inequality holds since for any e, ^^^jL|^Var(/ie) < 
the last inequality is true under the condition of Theorem [1] since 


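$$\sum_{j \ne i} \bar{A}_{ij} = (2n-1)\alpha_1 - (n-1)\alpha_2 = \sum_{m=2}^{M} \frac{p_m (m-1)}{2n} \left[ \binom{2n}{m} - 2\binom{n}{m} \right] = \Omega(d \ln n).$$

Denoting $[n-i] = \{1,\ldots,n\}\setminus\{i\}$, i.e., the first color class excluding vertex $i$,
we have $\sum_{j \in [n-i]} A_{ij} = \sum_{e \in \mathcal{E} : e \ni i} \frac{h_e\,|e \cap [n-i]|}{|e|}$, and one can bound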


I X] 2^+2) X/ j 

\je[n-i] ^ jeln-i] / 


< exp 


V 


1 

22M+4 


de[n-i] 


2 E Var(h.)l£fE + ^ 

e£€:e3i j(^[n—i] 


Y1 


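Thus, except on an event whose probability is controlled by the bounds above, we have

$$\sum_{j \in [n-i]} A_{ij} \le \left(1 + \frac{1}{2^{M+2}}\right) \sum_{j \in [n-i]} \bar{A}_{ij} = \sum_{m=2}^{M} \frac{p_m (n-1)}{m} \left(1 + \frac{1}{2^{M+2}}\right) \left[ \binom{2n-2}{m-2} - \binom{n-2}{m-2} \right]$$

and

$$\sum_{j \ne i} A_{ij} \ge \left(1 - \frac{1}{2^{M+2}}\right) \sum_{j \ne i} \bar{A}_{ij} = \sum_{m=2}^{M} \frac{p_m}{m} \left(1 - \frac{1}{2^{M+2}}\right) \left[ (2n-1)\binom{2n-2}{m-2} - (n-1)\binom{n-2}{m-2} \right].$$

Using the above relations, we can derive (8) since

$$\sum_{j \in V_1^{(t-1)}\setminus\{i\}} A_{ij} = \sum_{j \in W^{(t-1)} \cap V_1^{(t-1)}\setminus\{i\}} A_{ij} + \sum_{j \in V_1^{(t-1)}\setminus(W^{(t-1)} \cup \{i\})} A_{ij} \le \sum_{j \in W^{(t-1)}\setminus\{i\}} A_{ij} + \sum_{j \in [n-i]} A_{ij} < \eta + \left(1 + \frac{1}{2^{M+2}}\right) \sum_{j \in [n-i]} \bar{A}_{ij}.$$

The first inequality uses the fact that $V_1^{(t-1)}\setminus W^{(t-1)}$ is the set of correctly colored
nodes, with true color same as $i$. Hence, $V_1^{(t-1)}\setminus(W^{(t-1)} \cup \{i\}) \subset [n-i]$.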
From the definition of $\eta$, we have


E 






M 

sE 

m=2 

M 

= E 

m=2 

M 

+ E 


m=2 


Pmjn - 1) 
m 

Pmjn - 1) 
m 

Pmjn - 1) 
2m 




2 M +2 


2 n — 2 
m — 2 


n — 2 
m — 2 


1 - 

1 


2 M +2 

2 n — 2 
m — 2 


2n — 2 
m — 2 

n — 2 
m — 2 


1 /n -2 
2 — 2 
M 


Pmjn - 1) ^’ 


m=2 


n — 2 
m2^+3 \^m — 2 


One can see that the first term is at most ^ (l — 2 M+^ ) ^ 

On the other hand, we note that 



1 

_ \m/ 

- 4 


1 

. s < - 

/nN - ^ n" 
\m) 


(2nY 


= 2^ <2 


M 


4.m! 


So the second term is negative, which proves (8), and the claim follows.


Proof of Lemma 3


Let $C_1, C_2 \subset V$ be arbitrary such that $|C_2| = b$, and $\mathcal{E}_{C_1 C_2}$ be the set of all
non-monochromatic subsets of $V$ of size at most $M$ that have non-empty
intersection with both $C_1$ and $C_2$. Then


E E 

^ - M ^ e 

eSEciC2 eei?CiC2 



E E 

ieC2 isCi\{i} 


where the last inequality holds under the condition stated in the lemma. 
Now we bound the probability 


P I 3Ci,C2cr,|C2| = lie’ll 


jVf2 22Ai+5 


, ^ Aij > ?7 Vi G 02 I (9) 


ieCiUd 


< ^ P I 3 Ci,C 2 C L, IC 2 I = ^|Oi| = 6 , and he > 


b=l 


j \^2 22M+5 


eeEciCj 


M 


s E E E 0 E 

h=l C2.\C2\=b Ci:\Ci\=2b \eeEc 1 C 2 
We observe that


M 


Y. = Z] Z 

e6EciC2 m=2eeEciC2.b|=m 


M 


< 


26 ^ Y { 

m=2 ^ 


(2n — 2 


M 


< 622«+i p„. 


m — 2 
n — 2 


m=2 


m — 2 


< 


b‘^riM2^^^+^ 


n 


and the above bound is smaller than for b < "■ 

write 


±Kj± w \ Hence, we can 


T K>^ 


ye&Ec^C2 
< exp 


-(§-E 




2Ee.-. Var(M + i ^-E 


^eeEciC2 
f 3br]\ 


3 Z^eSEci C2 


Substituting in (9), we have that the probability of the existence of $C_1, C_2$ with
the mentioned conditions is at most


n 



Under the assumption of Theorem [U one can verify that t] > ^ 2 m +4 ■ So for 
large d, the above geometric series converges, and is at most = o(l). 

Hence, the claim. 